TITLE OF THE INVENTION



IMAGE ENCODING AND DECODING METHOD AND DEVICE 



BACKGROUND OF THE INVENTION 



Field of the Invention: 



This invention relates to an image encoding and decoding
method and an image encoding and decoding device, and
more specifically, to a method of synthesizing
interframe predicted images by calculating motion
vectors of pixels in an image through
interpolation/extrapolation of motion vectors of
representative points.

Background of the Invention: 

In high efficiency encoding of a moving image, 
it is known that interframe prediction (motion 
compensation) which uses similarities between frames 
produced at different times has a major effect on data 
compression. The motion compensating system that has 
become the mainstream of current image encoding 
technique is the block matching scheme adopted in H.261, 
MPEG 1 and MPEG 2 which are the international standards 
for moving image coding. In this system, the image to 
be encoded is divided into a large number of blocks, 
and motion vectors are calculated for each block. 

Block matching is currently the most widely
used motion compensation technique, but when the whole
image is enlarged, reduced or rotated, motion vectors
have to be transmitted for all blocks, so the coding
efficiency is poor. To deal with this problem, global
motion compensation has been proposed wherein the motion
vectors of the whole image are represented by a smaller
number of parameters (e.g. M. Hotter, "Differential
estimation of the global motion parameters zoom and
pan", Signal Processing, vol. 16, no. 3, pp. 249-265,
Mar. 1989). In this system, the motion vector (ug(x,y),
vg(x,y)) of a pixel (x, y) is expressed in the form:

$$
\begin{aligned}
u_g(x,y) &= a_0 x + a_1 y + a_2 \\
v_g(x,y) &= a_3 x + a_4 y + a_5
\end{aligned}
\tag{1}
$$

or

$$
\begin{aligned}
u_g(x,y) &= b_0 xy + b_1 x + b_2 y + b_3 \\
v_g(x,y) &= b_4 xy + b_5 x + b_6 y + b_7
\end{aligned}
\tag{2}
$$

and motion compensation is performed using this motion
vector. Herein, a0-a5, b0-b7 are the motion parameters.
When motion compensation is performed, the predicted
image on the transmitting side and receiving side must
be the same. For this purpose, the transmitting side
can transmit the values of a0-a5 or b0-b7 directly to
the receiving side; however, the motion vectors of
plural representative points may be transmitted instead.
Assume that the coordinates of pixels at the upper left,
upper right, lower left and lower right corners of an
image are respectively (0,0), (r,0), (0,s), (r,s), where
r and s are positive integers. If the horizontal and
vertical components of the motion vectors of the
representative points (0,0), (r,0), (0,s) are
respectively (ua,va), (ub,vb), (uc,vc), equation (1)
may be rewritten as:



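$$
\begin{aligned}
u_g(x,y) &= \frac{u_b-u_a}{r}\,x + \frac{u_c-u_a}{s}\,y + u_a \\
v_g(x,y) &= \frac{v_b-v_a}{r}\,x + \frac{v_c-v_a}{s}\,y + v_a
\end{aligned}
\tag{3}
$$
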
This means that the same functions can be
achieved by transmitting ua, va, ub, vb, uc, vc,
instead of a0-a5. In the same way, using the
horizontal and vertical components (ua,va), (ub,vb),
(uc,vc), (ud,vd) of the motion vectors of the four
representative points (0,0), (r,0), (0,s), (r,s),
equation (2) may be rewritten as:



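$$
\begin{aligned}
u_g(x,y) &= \frac{s-y}{s}\left(\frac{r-x}{r}\,u_a + \frac{x}{r}\,u_b\right)
          + \frac{y}{s}\left(\frac{r-x}{r}\,u_c + \frac{x}{r}\,u_d\right) \\
v_g(x,y) &= \frac{s-y}{s}\left(\frac{r-x}{r}\,v_a + \frac{x}{r}\,v_b\right)
          + \frac{y}{s}\left(\frac{r-x}{r}\,v_c + \frac{x}{r}\,v_d\right)
\end{aligned}
\tag{4}
$$
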
Therefore, the same functions can be achieved
by transmitting ua, va, ub, vb, uc, vc, ud, vd instead
of b0-b7. This situation is shown in Fig. 1. If
global motion compensation is performed between an
original image 102 of the current frame and a reference
image 101, motion vectors 107, 108, 109, 110 of
representative points 103, 104, 105, 106 (wherein
motion vectors are defined as starting at points in the
original image of the current frame and terminating at
corresponding points in the reference image) may be
transmitted instead of the motion parameters. In this
specification, the system using equation (1) is
referred to as global motion compensation based on
linear interpolation/extrapolation, and the system
using equation (2) is referred to as global motion
compensation based on bilinear
interpolation/extrapolation.
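
As an illustration of equations (3) and (4), the following
Python sketch computes the motion vector of a pixel from the
motion vectors of the corner representative points. It is
only a sketch under stated assumptions: the function names
affine_mv and bilinear_mv are hypothetical, exact rational
arithmetic is used instead of any quantization, and x, y, r,
s and the vector components are assumed to be integers.

```python
from fractions import Fraction


def affine_mv(x, y, r, s, va, vb, vc):
    """Equation (3): linear interpolation/extrapolation from three corners.

    va, vb, vc are the (u, v) motion vectors at (0, 0), (r, 0), (0, s).
    """
    def interp(fa, fb, fc):
        return Fraction(fb - fa, r) * x + Fraction(fc - fa, s) * y + fa
    return interp(va[0], vb[0], vc[0]), interp(va[1], vb[1], vc[1])


def bilinear_mv(x, y, r, s, va, vb, vc, vd):
    """Equation (4): bilinear interpolation/extrapolation from four corners.

    vd is the (u, v) motion vector at (r, s).
    """
    def interp(fa, fb, fc, fd):
        top = Fraction(r - x, r) * fa + Fraction(x, r) * fb
        bottom = Fraction(r - x, r) * fc + Fraction(x, r) * fd
        return Fraction(s - y, s) * top + Fraction(y, s) * bottom
    return (interp(va[0], vb[0], vc[0], vd[0]),
            interp(va[1], vb[1], vc[1], vd[1]))
```

Both functions reproduce the transmitted vectors at the
representative points themselves, and the bilinear form
reduces to the linear form when the fourth vector equals
vb + vc - va.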

Warping prediction is the application of this
global motion compensation processing to a smaller
area of the image. An example of warping prediction
using bilinear interpolation/extrapolation is shown in
Fig. 2. Fig. 2 shows processing to synthesize a
predicted image of an original image 202 of a current
frame using a reference image 201. First, the original
image 202 is divided into plural polygon-shaped
patches to give a patched image 209. The apex of a
patch is referred to as a lattice point, and each
lattice point is common to plural patches. For example,
a patch 210 comprises lattice points 211, 212, 213, 214,
and these lattice points are also the apices of other
patches. After dividing the image into plural patches
in this way, motion estimation is performed. In the
example shown here, motion estimation is performed on
lattice points using the reference image 201. As a
result, each patch is transformed in a reference image
203 obtained after motion estimation. For example,
the patch 210 corresponds to a transformed patch 204.
This is because, in motion estimation, it was estimated
that the lattice points 205, 206, 207, 208 corresponded
to 211, 212, 213, 214.

The motion vectors of the lattice points are
thereby found, and an interframe predicted image is
synthesized by calculating the motion vector for each
pixel in the patch by bilinear interpolation. The
processing of this warping prediction is basically the
same as the global motion compensation shown in Fig. 1,
and the "motion vectors at the corners of the image"
are transformed into "motion vectors of the lattice
points". If a triangular patch is used instead of a
rectangular patch, warping prediction may be realized
by linear interpolation/extrapolation.

Examples of encoding and decoding techniques to
simplify global motion compensation for representing
motion vectors of the whole image using a small number
of parameters may be found in the applicant's
inventions "Image Encoding and Decoding Methods"
(Japanese Unexamined Patent Publication: Application No.
Hei 8-60572) and "Methods of Synthesizing Interframe
Predicted Images" (Japanese Unexamined Patent
Publication: Application No. Hei 8-249601).

By introducing global motion compensation or
warping prediction described above, the motion of the
image can be expressed with fewer parameters, and a
high data compression rate can be achieved. However,
the processing amount in encoding and decoding is
greater than in the conventional method. In particular,
the divisions of equations (3) and (4) are major
factors in making the processing complex. In other
words, in global motion compensation or warping
prediction, a problem arises in that the processing
amount required to synthesize predicted images is large.

SUMMARY OF THE INVENTION

It is therefore an object of this invention to
reduce the computing amount by replacing the divisions
involved in motion compensation encoding and decoding
by a binary shift computation using registers with a
small number of bits.

In order to achieve the aforesaid objective,
this invention provides an image encoding and decoding
method for synthesizing an interframe predicted image
by global motion compensation or warping prediction
wherein global motion vectors are found by applying a
two-stage interpolation/extrapolation to motion vectors
of plural representative points having a spatial
interval with a characteristic feature. More
specifically, this invention provides a method of
synthesizing an interframe predicted image wherein,
when the motion vector of a pixel is calculated by
performing bilinear interpolation/extrapolation on
motion vectors of four representative points of an
image where the pixel sampling interval in both the
horizontal and vertical directions is 1 and the
horizontal and vertical coordinates of the sampling
points are obtained by adding w to integers (where
w=wn/wd, wn is a non-negative integer, wd is the hw
power of 2, hw is a non-negative integer and wn<wd),
there are representative points at coordinates (i, j),
(i+p, j), (i, j+q), (i+p, j+q) (where i, j, p, q are
integers), the horizontal and vertical components of the
motion vectors of representative points take the values
of integral multiples of 1/k (where k is the hk power
of 2, and hk is a non-negative integer), and when the
motion vector of a pixel at the coordinates (x+w, y+w)
is found, the horizontal and vertical components of the
motion vector at the coordinates (x+w, j) are found by
linear interpolation/extrapolation of motion vectors of
representative points at coordinates (i, j), (i+p, j),
as values which are respectively integral multiples of
1/z (where z is the hz power of 2, and hz is a
non-negative integer), and after finding the horizontal
and vertical components of the motion vector at the
coordinates (x+w, j+q) by linear
interpolation/extrapolation of motion vectors of
representative points at coordinates (i, j+q), (i+p,
j+q), as values which are respectively integral
multiples of 1/z (where z is the hz power of 2, and hz
is a non-negative integer), the horizontal and vertical
components of the motion vector of the pixel at the
coordinates (x+w, y+w) are found by linear
interpolation/extrapolation of the aforesaid two motion
vectors at the coordinates (x+w, j), (x+w, j+q) as
values which are respectively integral multiples of 1/m
(where m is the hm power of 2, and hm is a non-negative
integer).

This invention makes it possible to perform 
divisions by means of shift computations by 
appropriately selecting representative point 
coordinates, and to implement the aforesaid motion 
compensation scheme using registers having a small 
number of bits by reducing the number of shift bits in 
the shift computations. 




BRIEF DESCRIPTION OF THE DRAWINGS: 

Fig. 1 is a diagram showing an example of
global motion compensation for transmitting motion
vectors of representative points.

Fig. 2 is a diagram showing an example of
warping prediction.

Fig. 3 is a diagram showing an example of the
position of representative points for performing high
speed processing.

Fig. 4 is a diagram showing a typical
construction of a software image encoding device.

Fig. 5 is a diagram showing a typical
construction of the software image decoding device.

Fig. 6 is a diagram showing a typical
construction of an image encoding device according to
this invention.

Fig. 7 is a diagram showing a typical
construction of the image encoding device according to
this invention.

Fig. 8 is a diagram showing a typical
construction of a motion compensation processor 616 of
Fig. 6.

Fig. 9 is a diagram showing another typical
construction of the motion compensation processor 616
of Fig. 6.

Fig. 10 is a diagram showing a typical
construction of a predicted image synthesizer 711 of
Fig. 7.

Fig. 11 is a diagram showing a typical
construction of a predicted image synthesizer 1103 of
Fig. 9.

Fig. 12 is a diagram showing a typical
construction of a global motion compensation predicted
image synthesizer.

Fig. 13 is a diagram showing an example of a
processing flowchart in the software image encoding
device.

Fig. 14 is a diagram showing an example of a
motion compensation processing flowchart in the
software image encoding device.

Fig. 15 is a diagram showing an example of a
processing flowchart in the software image decoding
device.

Fig. 16 is a diagram showing an example of a
predicted image synthesis flowchart in the software
image decoding device.

Fig. 17 is a diagram showing a specific example
of a device using image encoding/decoding which
synthesizes a global motion compensation predicted
image by two-stage processing.



Preferred Embodiments of the Invention: 




This invention is an application of an
invention relating to a method of accelerating the
computation involved in global motion compensation and
warping prediction already proposed by the applicant
(Application No. Hei 08-060572 and Application No. Hei
08-249601). This invention will be described in the
context of its application to global motion
compensation, but it may also be applied to warping
prediction wherein identical processing to that of
global motion compensation is performed.

In the following description, it will be
assumed that the pixel sampling interval is 1 in both
the horizontal and vertical directions, and that a pixel
exists at each point whose horizontal and vertical
coordinates are obtained by adding w (where w=wn/wd, wn
is a non-negative integer, wd is a positive integer and
wn<wd) to integers. w represents the phase shift
between the coordinates of a representative point in
global motion compensation and the coordinates of a
pixel. Typically it has the values 0, 1/2, 1/4.
Further, it will be assumed that the numbers of pixels
of the image in the horizontal and vertical directions
are respectively r and s (where r and s are positive
integers), and image pixels lie in a range such that
the horizontal coordinate is from 0 to less than r, and
the vertical coordinate is from 0 to less than s.

When motion compensation is performed using
linear interpolation/extrapolation (affine
transformation) or bilinear interpolation/extrapolation
(co-1st order transformation), and quantization is
performed on the motion vector of each pixel,
mismatches are prevented and computations are
simplified (Japanese Unexamined Patent Publication:
Application No. Hei 06-193970). Hereafter, it will be
assumed that the horizontal component and vertical
component of the motion vector of a pixel are integral
multiples of 1/m (where m is a positive integer). It
will moreover be assumed that the global motion
compensation using motion vectors of representative
points described in "Background of the Invention" will
be performed, and that the motion vectors of
representative points are integral multiples of 1/k
(where k is a positive integer). In this specification,
the term "motion vector of a pixel" implies a motion
vector used in order to actually synthesize a predicted
image when performing global motion compensation.

On the other hand, the term "motion vector of a
representative point" means a parameter used to
calculate a motion vector of a pixel. Therefore, it may
occur that the motion vector of a pixel and the motion
vector of a representative point do not coincide even
if they are located at the same coordinates, due to
differences of quantization step size, etc.

First, global motion compensation using bilinear
interpolation/extrapolation will be described referring
to Fig. 3. In this example, instead of taking
representative points situated at the corners 301 of the
image, representative points 302, 303, 304, 305 are
generalized at (i, j), (i+p, j), (i, j+q), (i+p, j+q)
(where i, j, p, q are integers). The points 302, 303,
304, 305 may be situated inside or outside the image.
If the horizontal and vertical components of the motion
vectors of representative points multiplied by k are
respectively (u0,v0), (u1,v1), (u2,v2), (u3,v3) (where
u0, v0, u1, v1, u2, v2, u3, v3 are integers), the
values obtained by multiplying the horizontal and
vertical components of a motion vector of a pixel
situated at (x+w, y+w) by m, i.e., (u(x+w, y+w), v(x+w,
y+w)), may be expressed by the following equation when
w=0:

$$
\begin{aligned}
u(x+w,\,y+w) &= u(x,y) \\
&= \bigl(\bigl((j+q-y)((i+p-x)u_0 + (x-i)u_1) \\
&\quad + (y-j)((i+p-x)u_2 + (x-i)u_3)\bigr)m\bigr) \mathbin{//} (pqk) \\
v(x+w,\,y+w) &= v(x,y) \\
&= \bigl(\bigl((j+q-y)((i+p-x)v_0 + (x-i)v_1) \\
&\quad + (y-j)((i+p-x)v_2 + (x-i)v_3)\bigr)m\bigr) \mathbin{//} (pqk)
\end{aligned}
\tag{5}
$$

where x, y, u(x,y), v(x,y) are integers,
[//] is a division which rounds the result of an
ordinary division to the nearest integer when the
result is not an integer, and its order of priority as
an operator is equivalent to multiplication and
division. It is desirable to round non-integral
values to the nearest integer so as to minimize
computing errors. In this regard, there are four
methods of rounding the sum of 1/2 and an integer,
i.e.:

(1) round the value down to the next lowest integer,

(2) round the value up to the next highest integer,

(3) round the value down when the dividend is negative,
and round the value up when the dividend is positive
(assuming the divisor is always positive),

(4) round the value up when the dividend is negative,
and round the value down when the dividend is positive
(assuming the divisor is always positive).

In (1) and (2), the direction of rounding does
not change depending on whether the dividend is
positive or negative, and these methods therefore offer
an advantage from the viewpoint of processing amount to
the extent that it is unnecessary to determine positive
or negative. High speed processing using (2) can be
performed using for example the following equation (6).

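$$
\begin{aligned}
u(x+w,\,y+w) &= u(x,y) \\
&= \bigl(Lpqk + \bigl((j+q-y)((i+p-x)u_0 + (x-i)u_1) \\
&\quad + (y-j)((i+p-x)u_2 + (x-i)u_3)\bigr)m + ((pqk)\mathbin{\#}2)\bigr) \mathbin{\#} (pqk) - L \\
v(x+w,\,y+w) &= v(x,y) \\
&= \bigl(Mpqk + \bigl((j+q-y)((i+p-x)v_0 + (x-i)v_1) \\
&\quad + (y-j)((i+p-x)v_2 + (x-i)v_3)\bigr)m + ((pqk)\mathbin{\#}2)\bigr) \mathbin{\#} (pqk) - M
\end{aligned}
\tag{6}
$$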
where "#" is an integer division wherein digits after
the decimal point are discarded, and the order of
priority of computation is assumed to be the same as
that of multiplication and division. Generally, this is
the type of division that can be realized most easily
by a computer. L and M are numbers for ensuring that
the dividend is always positive, and are positive
integers which are sufficiently large. The term
((pqk)#2) is used to round off the division result to
the nearest integer.
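
The rounding trick of equation (6) can be sketched in
Python as follows. This is an illustrative sketch, not the
applicant's implementation: the function name
round_div_half_up and the parameter name offset (playing the
role of L or M) are assumptions, and Python's // is used as
the "#" division, which is valid here because the dividend
is kept non-negative.

```python
def round_div_half_up(n, d, offset):
    """Round n/d to the nearest integer, rounding halves up (method (2)).

    Uses only a truncating integer division, as in equation (6).
    d must be positive, and offset must be large enough that the
    shifted dividend offset*d + n + d//2 is non-negative.
    """
    dividend = offset * d + n + d // 2
    if d <= 0 or dividend < 0:
        raise ValueError("d must be positive and offset sufficiently large")
    # Truncation of a non-negative quotient equals floor, so after
    # removing the offset the result is floor(n/d + 1/2).
    return dividend // d - offset
```

For instance, with offset 10 the quotient 7/2 rounds to 4
and -7/2 rounds to -3, so the rounding direction for halves
does not depend on the sign of the dividend.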

Processing in terms of integers in itself
contributes to reducing the amount of processing.
However, if p, q, k are respectively set equal to the
α, β, hk power of 2 (where α, β, hk are non-negative
integers), the division in equation (5) can be
performed by a shift computation of α+β+hk bits, and
the amount of processing performed by the computer or
special hardware can be largely reduced. If m is set
equal to the hm power of 2 (where hm is a non-negative
integer, and hm<α+β+hk), equation (6) may be written
as:
$$
\begin{aligned}
u(x+w,\,y+w) &= u(x,y) \\
&= \bigl(\bigl(((2L+1) \ll (\alpha+\beta+h_k-h_m-1)) \\
&\quad + (j+q-y)((i+p-x)u_0 + (x-i)u_1) \\
&\quad + (y-j)((i+p-x)u_2 + (x-i)u_3)\bigr) \gg (\alpha+\beta+h_k-h_m)\bigr) - L \\
v(x+w,\,y+w) &= v(x,y) \\
&= \bigl(\bigl(((2M+1) \ll (\alpha+\beta+h_k-h_m-1)) \\
&\quad + (j+q-y)((i+p-x)v_0 + (x-i)v_1) \\
&\quad + (y-j)((i+p-x)v_2 + (x-i)v_3)\bigr) \gg (\alpha+\beta+h_k-h_m)\bigr) - M
\end{aligned}
\tag{7}
$$

Here (x<<α) means that x is shifted by α bits to the
left, and 0 is set in the lower α bits. (x>>α) means
that x is shifted by α bits to the right, and 0 is set
in the upper α bits. The order of priority of these
operators is intermediate between addition/subtraction
and multiplication/division. Therefore the number of
shift bits may be written as α+β+hk-hm.
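
The equivalence of the division form, equation (6), and the
shift form, equation (7), can be checked with a short Python
sketch. The function names and the argument big_l (standing
for L) are hypothetical, only the horizontal component is
shown, and the parameter values in the usage note are
arbitrary small test values rather than values from this
specification. Note that Python gives << and >> a lower
priority than + and -, unlike the convention stated above,
so the parentheses are written out explicitly.

```python
def numerator(x, y, i, j, p, q, u):
    """Bilinear weighting of the four representative-point components."""
    u0, u1, u2, u3 = u
    return ((j + q - y) * ((i + p - x) * u0 + (x - i) * u1)
            + (y - j) * ((i + p - x) * u2 + (x - i) * u3))


def mv_by_division(x, y, i, j, alpha, beta, hk, hm, u, big_l):
    """Equation (6): truncating division after adding the offset L."""
    p, q, k, m = 1 << alpha, 1 << beta, 1 << hk, 1 << hm
    d = p * q * k
    n = numerator(x, y, i, j, p, q, u)
    return (big_l * d + n * m + d // 2) // d - big_l


def mv_by_shift(x, y, i, j, alpha, beta, hk, hm, u, big_l):
    """Equation (7): the same value obtained with one right shift."""
    p, q = 1 << alpha, 1 << beta
    s = alpha + beta + hk - hm  # number of shift bits, must be >= 1
    n = numerator(x, y, i, j, p, q, u)
    return ((((2 * big_l + 1) << (s - 1)) + n) >> s) - big_l
```

With p=q=16, k=4, m=2 and representative-point components
(3, -5, 7, -2), both functions return the same value for
every pixel of a 16x16 block, provided big_l is large enough
to keep the shifted value non-negative.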

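When w is not 0, according to the definition
w=wn/wd, equation (5) may be rewritten by the following
equation (8):

$$
\begin{aligned}
u(x+w,\,y+w) &= u\!\left(x+\tfrac{w_n}{w_d},\; y+\tfrac{w_n}{w_d}\right) \\
&= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)((w_d i + w_d p - w_d x - w_n)u_0 + (w_d x + w_n - w_d i)u_1) \\
&\quad + (w_d y + w_n - w_d j)((w_d i + w_d p - w_d x - w_n)u_2 + (w_d x + w_n - w_d i)u_3)\bigr)m\bigr) \mathbin{//} (w_d^2\,pqk) \\
v(x+w,\,y+w) &= v\!\left(x+\tfrac{w_n}{w_d},\; y+\tfrac{w_n}{w_d}\right) \\
&= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)((w_d i + w_d p - w_d x - w_n)v_0 + (w_d x + w_n - w_d i)v_1) \\
&\quad + (w_d y + w_n - w_d j)((w_d i + w_d p - w_d x - w_n)v_2 + (w_d x + w_n - w_d i)v_3)\bigr)m\bigr) \mathbin{//} (w_d^2\,pqk)
\end{aligned}
\tag{8}
$$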



If wd is the hw power of 2 and hw is a
non-negative integer, the division by (p·q·k·wd·wd)
becomes a shift computation of α+β+hk+2hw bits, and the
division may therefore be replaced by a shift
computation as in the case of w=0. Also as in the case
of equation (7), if hm<α+β+hk+2hw, the number of
shift bits can be reduced to α+β+hk+2hw-hm bits by
dividing both the numerator and denominator by m.
Therefore, provided that wd is the hw power of 2, the
processing when w=0 and when w≠0 is basically the same.
Hereafter, although the equations are somewhat complex,
the case w≠0 will be described. To find the
calculation results for w=0, the substitutions wn=0,
wd=1, hw=0 may be made.




To obtain the same global motion compensation
predicted image on the transmitting and receiving sides,
information about motion vectors of representative
points must be transmitted to the receiving side in
some form or other. In one method, the motion vectors
of representative points are transmitted without
modification, and in another method, the motion vectors
of the corners of the image are transmitted and the
motion vectors of representative points are calculated
from these values. Hereafter, the latter method is
described.

Assume that the motion vectors of the four
corners (-c, -c), (r-c, -c), (-c, s-c), (r-c, s-c) of
the image can be only integral multiples of 1/n (where
n is a positive integer, c=cn/cd, cn is a non-negative
integer, cd is a positive integer and cn<cd), and that
(u00, v00), (u01, v01), (u02, v02), (u03, v03), which
are the horizontal and vertical components of these
vectors multiplied by n, are transmitted as global
motion parameters. c represents a phase shift between
the corners and representative points, and it typically
has a value of 0, 1/2 or 1/4. (u0, v0), (u1, v1), (u2,
v2), and (u3, v3), which are the horizontal and vertical
components of the motion vectors respectively at the
points (i, j), (i+p, j), (i, j+q), and (i+p, j+q)
multiplied by k, may be defined as:

$$
\begin{aligned}
u_0 &= u'(i,\,j) \\
v_0 &= v'(i,\,j) \\
u_1 &= u'(i+p,\,j) \\
v_1 &= v'(i+p,\,j) \\
u_2 &= u'(i,\,j+q) \\
v_2 &= v'(i,\,j+q) \\
u_3 &= u'(i+p,\,j+q) \\
v_3 &= v'(i+p,\,j+q)
\end{aligned}
\tag{9}
$$

where u'(x, y), v'(x, y) are defined by transforming
equation (5):

$$
\begin{aligned}
u'(x,y) &= \bigl(\bigl((c_d s - c_n - c_d y)((c_d r - c_n - c_d x)u_{00} + (c_d x + c_n)u_{01}) \\
&\quad + (c_d y + c_n)((c_d r - c_n - c_d x)u_{02} + (c_d x + c_n)u_{03})\bigr)k\bigr) \mathbin{///} (c_d^2\,rsn) \\
v'(x,y) &= \bigl(\bigl((c_d s - c_n - c_d y)((c_d r - c_n - c_d x)v_{00} + (c_d x + c_n)v_{01}) \\
&\quad + (c_d y + c_n)((c_d r - c_n - c_d x)v_{02} + (c_d x + c_n)v_{03})\bigr)k\bigr) \mathbin{///} (c_d^2\,rsn)
\end{aligned}
\tag{10}
$$

[///] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of priority is equivalent to multiplication
and division. In this way, if (u0, v0), (u1, v1), (u2,
v2), and (u3, v3) are calculated and global motion
compensation is performed at the representative points
(i, j), (i+p, j), (i, j+q), and (i+p, j+q), global
motion compensation at the representative points (-c,
-c), (r-c, -c), (-c, s-c), and (r-c, s-c) can be
approximated. As described hereabove, if p and q are
non-negative integral powers of 2, the processing can
be simplified. In general, it is preferable not to
perform extrapolation when calculating motion vectors
of pixels in the image by equation (5). This is to
avoid amplifying the quantization error in the motion
vectors of representative points through extrapolation.
For this reason, it is desirable that all the
representative points are in such positions that they
surround the pixels in the image. Hence when i=j=c=0,
it is appropriate that p and q are the same as, or
slightly larger than, r and s. Care must however be
exercised, because if the values of p and q are too
large, the number of bits required for the calculation
increases.
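
The derivation of representative-point vectors from the
transmitted corner vectors, equations (9) and (10), may be
sketched as follows for a single component. The names
round_div and rep_point_component are hypothetical, and the
[///] division is assumed here to round halves up; the text
leaves the choice among the rounding methods open.

```python
def round_div(n, d):
    """Nearest-integer division for d > 0, halves rounded up (assumed)."""
    return (2 * n + d) // (2 * d)


def rep_point_component(x, y, r, s, cn, cd, n, k, corners):
    """Equation (10) for one component (u' or v') at the point (x, y).

    corners = (f00, f01, f02, f03): the components, multiplied by n, at
    (-c, -c), (r-c, -c), (-c, s-c), (r-c, s-c) with c = cn/cd.
    The result is the component multiplied by k.
    """
    f00, f01, f02, f03 = corners
    wx0, wx1 = cd * r - cn - cd * x, cd * x + cn  # horizontal weights
    wy0, wy1 = cd * s - cn - cd * y, cd * y + cn  # vertical weights
    num = (wy0 * (wx0 * f00 + wx1 * f01)
           + wy1 * (wx0 * f02 + wx1 * f03)) * k
    return round_div(num, cd * cd * r * s * n)
```

Per equation (9), the function would be evaluated at (i, j),
(i+p, j), (i, j+q) and (i+p, j+q), once for the horizontal
and once for the vertical components, to obtain u0 to u3 and
v0 to v3. A uniform translation is reproduced exactly even
when the representative points lie outside the image.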

To reduce computational error in the processing
of equations (9) and (10), it is preferable that [///]
rounds off non-integral values to the nearest integer.
In this regard, the sum of 1/2 and an integer may be
rounded off to the nearest integer by any of the
aforesaid methods (1)-(4). However, compared to the
case of equation (5) (calculation performed for each
pixel), equation (10) requires fewer computations (only
four calculations for one image), so even when
methods (3) or (4) are chosen, there is
not much effect on the total computational amount.

If the values of p and q are set to
non-negative integral powers of 2 as described in the
above examples, the synthesis of interframe predicted
images in global motion compensation is greatly
simplified. However, there is still one other problem.
Considering for example the case p=512, q=512, k=32,
m=16, wd=2, wn=1 (w=0.5), which are typical parameters
in image coding, we have α+β+hk+2hw-hm=21. This means
that when u(x+w, y+w) is a value requiring 12 or more
bits in binary form, a register of at least 33 bits is
required to perform the high speed computation of
equation (8). When for example m=16, the value of
u(x+w, y+w) is obtained by multiplying the horizontal
component of the real motion vector by 16, so this
could well be a value requiring 12 or more bits in
binary form. At the present time, few processors have
registers capable of storing integers of 33 or more
bits, and they are expected to remain costly in future.
Moreover in general, if the processor circuit is large,
power consumption is correspondingly greater, so an
algorithm requiring a large register is also
disadvantageous from the viewpoint of power consumption.
Therefore it is desirable that even when the division
can be replaced by a shift computation, the number of
shift bits is as small as possible.
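
The bit-count arithmetic in the paragraph above can be
reproduced with a few lines of Python. The helper names are
hypothetical; the sketch only restates the sum
α+β+hk+2hw-hm and adds the width of the result.

```python
def log2_exact(v):
    """Return h such that v == 2**h, rejecting values that are not powers of 2."""
    if v <= 0 or v & (v - 1):
        raise ValueError("expected a power of two")
    return v.bit_length() - 1


def single_stage_shift_bits(p, q, k, m, wd):
    """Number of shift bits alpha + beta + hk + 2*hw - hm for equation (8)."""
    return (log2_exact(p) + log2_exact(q) + log2_exact(k)
            + 2 * log2_exact(wd) - log2_exact(m))


def register_bits(shift_bits, result_bits):
    """Width needed to hold the value before the right shift."""
    return shift_bits + result_bits
```

For p=512, q=512, k=32, m=16, wd=2 the shift is 21 bits, so
a 12-bit result needs a 33-bit register before shifting.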

To resolve this problem, the two-step algorithm
according to this invention which is described below
may be used. Prior to calculating the motion vector of
the pixel at the point (x+w, y+w) using the motion
vectors of the representative points (i, j), (i+p, j),
(i, j+q), (i+p, j+q), motion vectors at provisional
representative points (i, y+w) and (i+p, y+w) are
calculated so that the horizontal and vertical
components are integral multiples of 1/z (where z is a
positive integer). As in the aforesaid example, the
horizontal and vertical components of the motion
vectors of the representative points (i, j), (i+p, j),
(i, j+q), (i+p, j+q) multiplied by k are taken to be
respectively (u0, v0), (u1, v1), (u2, v2), (u3, v3)
(where u0, v0, u1, v1, u2, v2, u3, v3 are integers).
If the provisional representative points are situated
at (i, y+w) and (i+p, y+w), (uL(y+w), vL(y+w)) and
(uR(y+w), vR(y+w)), which are the horizontal and
vertical components of the motion vectors of these
provisional representative points multiplied by z, are
defined as follows:

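$$
\begin{aligned}
u_L(y+w) &= \bigl(((w_d j + w_d q - w_d y - w_n)u_0 + (w_d y + w_n - w_d j)u_2)\,z\bigr) \mathbin{////} (w_d\,qk) \\
v_L(y+w) &= \bigl(((w_d j + w_d q - w_d y - w_n)v_0 + (w_d y + w_n - w_d j)v_2)\,z\bigr) \mathbin{////} (w_d\,qk) \\
u_R(y+w) &= \bigl(((w_d j + w_d q - w_d y - w_n)u_1 + (w_d y + w_n - w_d j)u_3)\,z\bigr) \mathbin{////} (w_d\,qk) \\
v_R(y+w) &= \bigl(((w_d j + w_d q - w_d y - w_n)v_1 + (w_d y + w_n - w_d j)v_3)\,z\bigr) \mathbin{////} (w_d\,qk)
\end{aligned}
\tag{11}
$$
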



[////] is a division which rounds the result
of an ordinary division to the nearest integer when
the result is not an integer, and the order of priority
is equivalent to multiplication and division. (The
required function for [////] is the same as the [///]
described above.) As (i, y+w) lies on a line joining
(i, j) and (i, j+q), (uL(y+w), vL(y+w)) can easily be
found by a first order linear
interpolation/extrapolation using (u0, v0) and (u2, v2).
Likewise, as (i+p, y+w) lies on a line joining (i+p, j)
and (i+p, j+q), (uR(y+w), vR(y+w)) may also be found by
a first order linear interpolation/extrapolation using
(u1, v1) and (u3, v3).

By performing another first order linear
interpolation/extrapolation on the motion vectors
(uL(y+w), vL(y+w)) and (uR(y+w), vR(y+w)) of the
provisional representative points found as described
above, (u(x+w, y+w), v(x+w, y+w)), which are the
horizontal and vertical components of the motion vector
of the pixel at (x+w, y+w) multiplied by m, are found.
This processing is performed by the following equation:



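$$
\begin{aligned}
u(x+w,\,y+w) &= \bigl(((w_d i + w_d p - w_d x - w_n)\,u_L(y+w) \\
&\quad + (w_d x + w_n - w_d i)\,u_R(y+w))\,m\bigr) \mathbin{//} (w_d\,pz) \\
v(x+w,\,y+w) &= \bigl(((w_d i + w_d p - w_d x - w_n)\,v_L(y+w) \\
&\quad + (w_d x + w_n - w_d i)\,v_R(y+w))\,m\bigr) \mathbin{//} (w_d\,pz)
\end{aligned}
\tag{12}
$$
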
In the same way as described above, if p is the
α power of 2, m is the hm power of 2, z is the hz
power of 2, and wd is the hw power of 2 (where α, hm,
hz, hw are non-negative integers), the division by
p·z·wd in equation (12) may be replaced by an
α+hz+hw-hm bit right shift (where hm<α+hz+hw). Also,
if z=16 (hz=4), and the typical parameters p=512,
q=512, k=32, m=16, wd=2, wn=1 (w=0.5) are used, the
number of shift bits is 10, so the number of bits
required for the register used in the computation can
be largely reduced. It may be noted that in the above
example, the motion vector of a provisional
representative point is found by performing a first
order linear interpolation/extrapolation in the vertical
direction on the motion vectors of representative
points, and the motion vector of a pixel is then found
by performing a first order linear
interpolation/extrapolation in the horizontal direction
on the motion vectors of the provisional representative
points. Conversely, the same result may be obtained by
performing a first order linear
interpolation/extrapolation in the horizontal direction
when the motion vector of a provisional representative
point is found, and in the vertical direction when the
motion vector of a pixel is found.
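
The two-step computation of equations (11) and (12) can be
sketched in Python for one component as follows. This is a
minimal sketch under stated assumptions: the function names
are hypothetical, the rounding divisions are assumed to
round halves up, and the offset L used elsewhere in the text
is omitted because Python's >> already rounds toward minus
infinity for negative integers.

```python
def round_div(n, d):
    """Nearest-integer division for d > 0, halves rounded up (assumed)."""
    return (2 * n + d) // (2 * d)


def provisional_mv(y, j, q, k, z, wn, wd, f_top, f_bottom):
    """Equation (11): vertical interpolation to the row y + wn/wd.

    f_top and f_bottom are components (times k) at rows j and j+q.
    The result is the component times z.
    """
    num = ((wd * j + wd * q - wd * y - wn) * f_top
           + (wd * y + wn - wd * j) * f_bottom) * z
    return round_div(num, wd * q * k)


def pixel_mv(x, i, p, z, m, wn, wd, f_left, f_right):
    """Equation (12): horizontal interpolation, result times m."""
    num = ((wd * i + wd * p - wd * x - wn) * f_left
           + (wd * x + wn - wd * i) * f_right) * m
    return round_div(num, wd * p * z)


def pixel_mv_shift(x, i, alpha, hz, hm, hw, wn, f_left, f_right):
    """Equation (12) with the division replaced by a right shift."""
    wd, p = 1 << hw, 1 << alpha
    s = alpha + hz + hw - hm  # 10 bits for the typical parameters
    n = ((wd * i + wd * p - wd * x - wn) * f_left
         + (wd * x + wn - wd * i) * f_right)
    return (n + (1 << (s - 1))) >> s
```

With the typical parameters p=q=512, k=32, z=16, m=16, wd=2,
wn=1, provisional_mv is evaluated twice per component for
each row and pixel_mv_shift once per component for each
pixel, using a 10-bit shift.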

In this scheme, the two steps of equations (11)
and (12) are required to find a motion vector of a
pixel, and at first sight it might appear that this
would lead to an increase of computation amount.
However, once the motion vector of a provisional
representative point has been found, it may be used
for all r pixels on a line having the vertical
coordinate y+w, so the percentage of the total
processing amount due to equation (11) is very small.
Therefore, the advantage gained by the lesser number of
bits (i.e. a smaller register size) outweighs the
disadvantage of increased computation amount due to
having to perform the calculation of equation (11).

After obtaining the values (u(x+w, y+w), v(x+w,
y+w)), they may be divided into integral parts
(uI(x+w, y+w), vI(x+w, y+w)) and fractional parts
(uF(x+w, y+w), vF(x+w, y+w)) by the following
processing.

$$
\begin{aligned}
u_I(x+w,\,y+w) &= ((Lm + u(x+w,\,y+w)) \gg h_m) - L \\
v_I(x+w,\,y+w) &= ((Mm + v(x+w,\,y+w)) \gg h_m) - M
\end{aligned}
\tag{13}
$$

$$
\begin{aligned}
u_F(x+w,\,y+w) &= u(x+w,\,y+w) - u_I(x+w,\,y+w)\,m \\
v_F(x+w,\,y+w) &= v(x+w,\,y+w) - v_I(x+w,\,y+w)\,m
\end{aligned}
\tag{14}
$$



where uI(x+w, y+w), vI(x+w, y+w) are integers expressing
the integral parts of the motion vector of a pixel, and
uF(x+w, y+w), vF(x+w, y+w) are integers, each having a
value from 0 to less than m, expressing m times the
fractional parts of the motion vector of a pixel. As in the
above example, m is the hm power of 2 (where hm is a
non-negative integer), and L and M are integers
sufficiently large to make the values to be shifted
non-negative.
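
The splitting of equations (13) and (14) can be sketched as follows for one component. The function name and the sample values are assumptions for illustration; the offset argument plays the role of L (or M) and must be chosen large enough by the caller.

```python
def split_motion_component(u, hm, offset):
    """Split u (a motion vector component multiplied by m = 2**hm)
    into its integral part and m times its fractional part.

    offset corresponds to L (or M) in equation (13) and is assumed to be
    large enough that offset * m + u is non-negative.
    """
    m = 1 << hm
    to_shift = offset * m + u
    if to_shift < 0:
        raise ValueError("offset is too small: value to be shifted is negative")
    integral = (to_shift >> hm) - offset   # equation (13)
    fractional = u - integral * m          # equation (14), 0 <= fractional < m
    return integral, fractional


# With m = 4, u = -5 represents -1.25 pixels = -2 + 3/4.
print(split_motion_component(-5, hm=2, offset=16))
```

The offset keeps the shifted value non-negative, so the result does not depend on how a particular processor or language shifts negative numbers.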

When bilinear interpolation is used as a method of
interpolating luminance values, the luminance value of a
pixel in the interframe predicted image can also be found
by the following processing. When
x' = x+w+uI(x+w, y+w), y' = y+w+vI(x+w, y+w), if the
luminance values of pixels at (x', y'), (x'+1, y'),
(x', y'+1), (x'+1, y'+1) in the reference image are Ya, Yb,
Yc, Yd, the luminance value Y(x+w, y+w) of a pixel at
the point (x+w, y+w) in the interframe predicted image
may be found by:

    Y(x+w, y+w) = ((m - v_F)((m - u_F) Y_a + u_F Y_b)
                  + v_F((m - u_F) Y_c + u_F Y_d)
                  + (m^2 \gg 1)) \gg (2 h_m)
    (15)

where uF, vF are respectively abbreviations for uF(x+w,
y+w), vF(x+w, y+w).
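
Equation (15) uses only integer multiplications, additions and shifts. A minimal sketch follows; the function name and the sample luminance values are assumptions for illustration.

```python
def predict_luminance(ya, yb, yc, yd, u_frac, v_frac, hm):
    """Bilinear interpolation of equation (15) in integer arithmetic.

    u_frac and v_frac are m times the fractional parts of the motion
    vector (0 <= value < m), where m = 2**hm.
    """
    m = 1 << hm
    upper = (m - u_frac) * ya + u_frac * yb   # between (x', y') and (x'+1, y')
    lower = (m - u_frac) * yc + u_frac * yd   # between (x', y'+1) and (x'+1, y'+1)
    rounding = (m * m) >> 1                   # the term (m^2 >> 1)
    return ((m - v_frac) * upper + v_frac * lower + rounding) >> (2 * hm)


# Half-pixel position between four pixels, with m = 4.
print(predict_luminance(10, 20, 30, 40, u_frac=2, v_frac=2, hm=2))
```

The added term m^2 >> 1 makes the final shift round to the nearest integer instead of truncating.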

In equations (12) and (13), α+hz+hw-hm bit and hm bit
right shifts are respectively performed. This means that
if an (α+hz+hw-hm)+hm = α+hz+hw bit right shift is
performed in the calculation of equation (10),
uI(x+w, y+w) and vI(x+w, y+w) can be calculated in one
step. It is convenient if α+hz+hw is an integral multiple
of 8 because, in general, the size of a processor register
is a multiple of 8 bits. Frequently, two 8 bit registers
(an upper bit register and a lower bit register) are
linked to make a 16 bit register, or four 8 bit registers
or two 16 bit registers are linked to make a 32 bit
register. If the values of uI(x+w, y+w) and vI(x+w, y+w)
are calculated by, for example, a 16 bit shift
computation, there is then no need to perform a separate
shift computation. In other words, if the value prior to
the shift is stored in a 32 bit register, and the upper
16 bits are used as a separate register, the value of
uI(x+w, y+w) or vI(x+w, y+w) is already stored in this
16 bit register.
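
The equivalence between a 16 bit right shift and reading the upper half of a 32 bit register can be illustrated as follows. This is only a sketch for non-negative values; the function names are assumptions, and the byte string stands in for a register pair.

```python
def shift_right_16(value):
    """Explicit 16 bit right shift of a non-negative 32 bit value."""
    return (value & 0xFFFFFFFF) >> 16


def upper_half(value):
    """Read the upper 16 bit register of a 32 bit register pair.

    The 32 bit register is modelled as 4 bytes in big-endian order, so
    the upper 16 bit register is simply the first two bytes.
    """
    register = (value & 0xFFFFFFFF).to_bytes(4, "big")
    return int.from_bytes(register[:2], "big")


value = 0x12345678
print(hex(shift_right_16(value)), hex(upper_half(value)))
```

No arithmetic is needed in the second function: the result is obtained only by choosing which part of the register to read.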

It will be appreciated that making the number of shift
bits an integral multiple of 8 facilitates not only the
processing of equation (10) but all aspects of the shift
computation. It is particularly important to make
processing easier when a large number of shift
computations has to be performed (e.g. a shift
computation for each pixel). Also, by first applying the
same number of left shift bits to both the numerator and
the denominator, the number of right shift bits due to
division can be increased even when the number of shift
bits is not an integral multiple of 8. For example, when a
computation is performed by a 6 bit right shift, the same
computation can be performed by an 8 bit right shift by
first multiplying the value to which the shift operation
is applied by 4 (this is equivalent to performing a 2 bit
left shift). (Taking equation (5), which concerns
u(x+w, y+w), as an example, this processing can be
implemented by first multiplying u0, u1, u2, u3 by 4.)
Care must however be taken when performing this
processing that overflow of the value to be shifted does
not occur.
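
The 6 bit example above can be sketched as follows. The function name, the default register width of 32 bits and the overflow check are assumptions for illustration.

```python
def shift_via_wider(value, shift, target=8, register_bits=32):
    """Compute value >> shift using a fixed right shift of target bits.

    The value is first multiplied by 2**(target - shift), i.e. shifted
    left, and an OverflowError is raised if the multiplied value no
    longer fits a signed register of register_bits bits.
    """
    if shift > target:
        raise ValueError("shift must not exceed target")
    padded = value << (target - shift)
    limit = 1 << (register_bits - 1)
    if not -limit <= padded < limit:
        raise OverflowError("value to be shifted overflows the register")
    return padded >> target


# A 6 bit right shift performed as "multiply by 4, then shift by 8 bits".
print(shift_via_wider(1000, 6), 1000 >> 6)
```

The check makes the caution in the text concrete: the left shift consumes head room in the register, so values close to the register limit cannot be padded.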

Many image encoders and decoders are able to adapt to a
plurality of image sizes. In this case, when for example
global motion compensation is performed using equations
(12), (13), (14), the number of shift bits changes
according to the image size and can no longer be fixed to
an integral multiple of 8. This can be treated as follows.
For example, consider the situation where an α+hz+hw bit
right shift is required to calculate uI(x+w, y+w) and
vI(x+w, y+w) as described above, where α can have a value
in the range 7-11. If hz=5, hw=1 when α is 10 or less, and
hz=4, hw=1 when α=11, the number of shift bits can be
arranged to be always 16 or less. As stated hereinbefore,
when the number of shift bits is less than 16, it can be
simulated to be 16 by first multiplying the value to be
shifted by a constant. Hence, when the image size changes,
the number of shift bits can be controlled to a convenient
number by varying other parameters (e.g. quantization step
size of motion vectors) accordingly. Care must be taken
however not to make the quantization step size of the
motion vectors so large that it causes an appreciable
degradation of the decoded image.
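
The parameter choice in this example can be tabulated with a short sketch. The function name and the returned tuple are assumptions for illustration, and the sketch also assumes that α = 10 is handled in the same way as the smaller values.

```python
def shift_parameters(alpha):
    """Return (hz, hw, shift_bits, premultiplier) for alpha in 7..11.

    shift_bits is alpha + hz + hw, and premultiplier is the constant by
    which the value to be shifted is multiplied so that a fixed 16 bit
    right shift can always be used.
    """
    if not 7 <= alpha <= 11:
        raise ValueError("alpha is outside the range of this example")
    hw = 1
    hz = 5 if alpha <= 10 else 4   # assumption: alpha = 10 uses hz = 5
    shift_bits = alpha + hz + hw
    premultiplier = 1 << (16 - shift_bits)
    return hz, hw, shift_bits, premultiplier


for alpha in range(7, 12):
    print(alpha, shift_parameters(alpha))
```

For α = 7 the shift is 13 bits, so the value is first multiplied by 8; for α = 10 and α = 11 the shift is exactly 16 bits and no multiplication is needed.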

When the algorithm shown in this specification is applied
to ordinary global motion compensation, the motion vector
of a representative point is first found to a precision of
1/k pixels using the motion vectors of the corners of the
image, which have a precision of 1/n pixels. Next, the
motion vector of a provisional representative point is
found to a precision of 1/z pixels using the motion vector
of the representative point, and the motion vector of a
pixel is found to a precision of 1/m pixels using the
motion vector of this provisional representative point.
When the motion vectors of the corners of the image are
transmitted as a motion parameter, it is desirable to make
k as large a value as possible in order to closely
approximate bilinear interpolation/extrapolation by this
parameter. However, the horizontal and vertical components
of the motion vector of the representative point include
an error having an absolute value equal to or less than
1/(2k) due to the effect of quantization. From the
viewpoint of making the approximation more accurate, it is
preferable to increase the precision of the motion vector
of the provisional representative point as well. However,
since the motion vector of the provisional representative
point is found using the motion vector of the
representative point, there is no advantage to be gained
in calculating it with an accuracy greater than that of
the motion vector of the representative point. Therefore,
it is preferable that z<k in order to suppress the number
of bits required for the computation, and that m<z for the
same reason.

The above discussion has considered global motion
compensation using bilinear interpolation/extrapolation,
but the number of shift bits can be suppressed by
introducing the same processing in the case of linear
interpolation/extrapolation. For example, assume that the
horizontal and vertical components of the motion vectors
of representative points at (i, j), (i+p, j), and (i, j+q)
(where i, j, p, q are integers) multiplied by k are
(u0, v0), (u1, v1), and (u2, v2) (where u0, v0, u1, v1,
u2, v2 are integers). The horizontal and vertical
components of the motion vector of a pixel (x+w, y+w)
multiplied by m, i.e. (u(x+w, y+w), v(x+w, y+w)), can
then be expressed as follows (where x, y, u(x+w, y+w),
v(x+w, y+w) are integers and the definition of w is the
same as above).



    u(x+w, y+w) = (((u_1 - u_0)(w_d x + w_n - w_d i) q
                  + (u_2 - u_0)(w_d y + w_n - w_d j) p
                  + u_0 w_d p q) m) // (w_d p q k)

    v(x+w, y+w) = (((v_1 - v_0)(w_d x + w_n - w_d i) q
                  + (v_2 - v_0)(w_d y + w_n - w_d j) p
                  + v_0 w_d p q) m) // (w_d p q k)
    (16)

In this case also, p, q, k, m, wd are respectively the
α, β, hk, hm and hw powers of 2 (where α, β, hk, hm and
hw are non-negative integers), and if α > β, this equation
may be rewritten as:

    u(x+w, y+w) = (((u_1 - u_0)(w_d x + w_n - w_d i)
                  + (u_2 - u_0)(w_d y + w_n - w_d j) 2^{\alpha-\beta}
                  + u_0 w_d p) m) // (w_d p k)

    v(x+w, y+w) = (((v_1 - v_0)(w_d x + w_n - w_d i)
                  + (v_2 - v_0)(w_d y + w_n - w_d j) 2^{\alpha-\beta}
                  + v_0 w_d p) m) // (w_d p k)
    (17)

As in the case when bilinear interpolation/extrapolation
is used, the integer part of the motion vector of the
pixel at (x+w, y+w) can be found by an α+hk+hw bit right
shift, therefore if α+hk+hw is arranged to be an integral
multiple of 8, processing can be simplified for the same
reason as given above. It should be noted that when α < β,
the number of shift bits is β+hk+hw.
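
The number of shift bits stated above can be checked by reducing the fraction of equation (16) by the smaller of p and q. The sketch below works with exact fractions for one component; the function name and the sample values are assumptions for illustration, and the rounding performed by the // operator of this specification is not modelled.

```python
from fractions import Fraction


def linear_component(u0, u1, u2, x, y, i, j, alpha, beta, hk, hm, hw, wn):
    """Return (exact value, reduced numerator, number of shift bits)."""
    p, q, k, m, wd = (1 << e for e in (alpha, beta, hk, hm, hw))
    dx = wd * x + wn - wd * i
    dy = wd * y + wn - wd * j
    # Numerator and denominator of equation (16).
    num16 = ((u1 - u0) * dx * q + (u2 - u0) * dy * p + u0 * wd * p * q) * m
    den16 = wd * p * q * k
    # Divide both by the smaller of p and q (exact, both are powers of 2).
    g = min(p, q)
    num_red = ((u1 - u0) * dx * (q // g)
               + (u2 - u0) * dy * (p // g)
               + u0 * wd * (p * q // g)) * m
    den_red = wd * (p * q // g) * k
    shift_bits = den_red.bit_length() - 1   # den_red is a power of 2
    return Fraction(num16, den16), num_red, shift_bits


value, numerator, shift = linear_component(
    4, 12, -6, x=5, y=9, i=0, j=0, alpha=4, beta=3, hk=2, hm=2, hw=1, wn=1)
print(value, numerator, shift)
```

With α = 4, β = 3, hk = 2 and hw = 1 the reduced denominator is 2 to the power 7, which is α+hk+hw; exchanging the values of α and β gives β+hk+hw instead.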

The construction of the image encoding device and
decoding device for performing image encoding and
decoding according to this invention, which uses the
synthesis of interframe predicted images, will now be
described.

Fig. 6 shows the configuration of one embodiment of an
image encoding device according to this invention. The
construction shown in the diagram is essentially the same
as in the prior art encoding device except for the motion
compensation processor 616.

A subtractor 602 calculates a difference between an input
frame (original image of a current frame which is to be
encoded) 601 and an output image 613 (interframe
predicted image) of an interframe/intraframe coding
changeover switch 619, and outputs a differential image
603. This differential image is quantized by a quantizer
605 after conversion to DCT coefficients by a DCT
converter 604, so as to give quantized DCT coefficients
606. These quantized DCT coefficients are output as
transmission information to a transmission path, and are
also used to synthesize an interframe predicted image in
the encoder.

The procedure for synthesizing the interframe predicted
image will now be described.

The quantized DCT coefficients 606 pass through an
inverse quantizer 608 and inverse DCT converter 609 so as
to give a decoded differential image 610 (same image as
the differential image reproduced on the receiving side).
The output image 613 of the interframe/intraframe coding
changeover switch 619 (described later) is added to this
in an adder 611, and a decoded image 612 of the current
frame (same image as the decoded image of the current
frame reproduced on the receiving side) is thus obtained.
This image is first stored in a frame memory 614, and is
delayed by the time for one frame. At this time,
therefore, the frame memory 614 outputs a decoded image
615 of the immediately preceding frame. The decoded image
of the immediately preceding frame and the input image
601 of the current frame are input to a motion
compensation processor 616, and the motion compensation
processor 616 synthesizes the aforesaid interframe
predicted image. The configuration of this processor will
be described later.

A predicted image 617 is input to the
interframe/intraframe coding changeover switch 619
together with a "0" signal 618. This switch changes over
between interframe coding and intraframe coding by
selecting either of these inputs.

When the predicted image 617 is selected (Fig. 6 shows
this case), interframe coding is performed. On the other
hand, when the "0" signal is selected, the input image is
DCT encoded as it is and output to the transmission path,
so intraframe coding is performed.

To obtain a correctly decoded image on the receiving
side, it is necessary to know whether interframe coding
or intraframe coding was performed on the transmitting
side. For this purpose, an identifying flag 621 is output
to the transmission path. Finally, an H.261 encoded bit
stream 623 is obtained by multiplexing the quantized DCT
coefficients, motion vectors and identifying flag
information in a multiplexing unit 622.

Fig. 7 shows a typical construction of a decoder 700 for
receiving the encoded bit stream output by the encoder of
Fig. 6.

A bit stream 717 which is received is split into
quantized DCT coefficients 701, motion vectors 702 and an
intraframe/interframe identifying flag 703 by a
demultiplexer 716.

The quantized DCT coefficients 701 pass through an
inverse quantizer 704 and inverse DCT converter 705 so as
to give a differential image 706. This differential image
is added to an output image 715 of an
interframe/intraframe coding changeover switch 714 in an
adder 707, and the result is then output as a decoded
image 708. The interframe/intraframe coding changeover
switch changes over the output depending on the
interframe/intraframe coding identifying flag 703. The
predicted image 712 used when interframe coding is
performed is synthesized in a predicted image synthesizer
711. Here, positions are shifted according to the
received motion vectors 702 relative to the decoded image
710 of the immediately preceding frame stored in a frame
memory 709. In the case of intraframe coding, the
interframe/intraframe coding changeover switch merely
outputs a "0" signal 713.
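
The relation between the local decoding loop of the encoder of Fig. 6 and the decoder of Fig. 7 can be illustrated with a strongly simplified sketch. It is not the H.261 processing: the DCT is omitted, the predicted image is simply the preceding decoded frame (no motion compensation), frames are flat lists of integers, and all function names are assumptions for illustration. The first frame is assumed to be intraframe coded.

```python
def quantize(residual, step):
    """Quantize to levels, rounding half up, in integer arithmetic."""
    return [(2 * r + step) // (2 * step) for r in residual]


def reconstruct(levels, prediction, step):
    """Inverse quantization followed by the adder (611 or 707)."""
    return [p + level * step for p, level in zip(prediction, levels)]


def encode(frames, intra_flags, step):
    stream, local_decoded, reference = [], [], None
    for frame, intra in zip(frames, intra_flags):
        # Changeover switch: "0" signal for intraframe coding,
        # otherwise the predicted image.
        prediction = [0] * len(frame) if intra else reference
        levels = quantize([f - p for f, p in zip(frame, prediction)], step)
        reference = reconstruct(levels, prediction, step)  # frame memory
        stream.append((intra, levels))
        local_decoded.append(reference)
    return stream, local_decoded


def decode(stream, step):
    decoded, reference = [], None
    for intra, levels in stream:
        prediction = [0] * len(levels) if intra else reference
        reference = reconstruct(levels, prediction, step)
        decoded.append(reference)
    return decoded


frames = [[10, 20, 30], [12, 19, 33], [15, 15, 40]]
stream, local_decoded = encode(frames, [True, False, False], step=4)
print(decode(stream, step=4) == local_decoded)
```

Because the encoder predicts from its own decoded image rather than from the original image, the decoder reproduces exactly the same images and quantization errors do not accumulate from frame to frame.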

Fig. 8 shows a typical construction of the motion
compensation processor 616 of the image encoding device
which uses a global motion compensation scheme based on
linear interpolation/extrapolation for transmitting
motion vectors of representative points. Numbers which
are the same as those of Fig. 6 denote the same
components. Motion estimation relating to global motion
compensation is performed between the decoded image 615
of the immediately preceding frame and the original image
601 of the current frame by the global motion estimating
unit 802, and global motion compensation parameters (for
example, values of the aforesaid ua, va, ub, vb, uc, vc,
ud, vd) are estimated. Information 803 about these values
is transmitted as part of motion information 620. A
global motion compensation predicted image 804 is
synthesized by a global motion compensation predicted
image synthesizer 808 using equation (3), and is supplied
to a block matching unit 805. Here, motion compensation
(motion estimation and predicted image synthesis) by
block matching is performed between the global motion
compensation predicted image and the original image of
the current frame, and block motion vector information
806 and a final predicted image 617 are thereby obtained.
This motion vector information is multiplexed with motion
parameter information in a multiplexer 807, and output as
the motion information 620.

Fig. 10 shows a typical construction of the predicted
image synthesizer 711 of Fig. 7. Numbers which are the
same as those of other diagrams denote the same
components. A global motion compensation predicted image
804 is synthesized in the global motion compensation
predicted image synthesizer 808 using the global motion
compensation parameters 803 extracted from the motion
information 702 in a splitting unit 1002, relative to the
decoded image 710 of the immediately preceding frame. The
image 804 is supplied to a block matching predicted image
synthesizer 1001, and the final predicted image 712 is
synthesized using the block matching motion vector
information 806 extracted from the motion information
702.

Fig. 9 shows another typical construction of the motion
compensation processor 616. Numbers which are the same as
those of Fig. 6 denote the same components. In this
example, global motion compensation or block matching is
applied to each block. Motion compensation is performed
between the decoded image 615 of the immediately
preceding frame and the original image 601 of the current
frame, respectively by global motion compensation in a
global motion estimating unit 902 and global motion
compensation predicted image synthesizer 911, and by
block matching in a block matching unit 905. A selection
switch 908 selects the most suitable scheme for every
block between a predicted image 903 due to global motion
compensation and a predicted image 906 due to block
matching. Global motion compensation parameters 904,
motion vectors 907 for each block and selection
information 909 relating to global motion
compensation/block matching are multiplexed by a
multiplexer 910, and the result is output as the motion
information 620.
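
The per-block selection performed by the selection switch 908 can be sketched as follows. The specification does not state the selection criterion; the sum of absolute differences (SAD) used here, the function names and the representation of a block as a flat list of luminance values are all assumptions for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two blocks of equal size."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def select_per_block(original_blocks, global_blocks, matching_blocks):
    """Choose, for every block, the predicted block closer to the original.

    Returns the selection information and the selected predicted blocks.
    """
    selection, predicted = [], []
    for original, by_global, by_matching in zip(
            original_blocks, global_blocks, matching_blocks):
        use_global = sad(original, by_global) <= sad(original, by_matching)
        selection.append("global" if use_global else "block")
        predicted.append(by_global if use_global else by_matching)
    return selection, predicted


original = [[10, 10, 10, 10], [50, 60, 70, 80]]
by_global = [[10, 11, 10, 9], [40, 40, 40, 40]]
by_matching = [[0, 0, 0, 0], [50, 61, 70, 79]]
print(select_per_block(original, by_global, by_matching))
```

The returned selection list corresponds to the selection information 909 that is multiplexed into the motion information, so that a decoder can repeat the same choice without access to the original image.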
Fig. 11 shows a typical construction of a predicted image
synthesizer 1103 of a decoder which decodes the bit
stream generated by an image encoding device using a
motion compensation processor 901. Numbers which are the
same as those of other diagrams denote the same
components. The global motion compensation predicted
image 903 is synthesized in the global motion
compensation predicted image synthesizer 911 using global
motion compensation parameters 904 extracted from the
motion information 702 in the splitting unit 1002,
relative to the decoded image 710 of the immediately
preceding frame. The block matching predicted image 906
is synthesized in the block matching predicted image
synthesizer 1101 using block matching motion vector
information 907 extracted from the motion information
702, relative to the decoded image 710 of the immediately
preceding frame. A selection switch selects either of
these schemes for each block, i.e., the predicted image
903 due to global motion compensation or the predicted
image 906 due to block matching, based on the selection
information 909 extracted from the motion information
702. After this selection process is applied to each
block, the final predicted image 712 is synthesized.

Fig. 12 shows the structural configuration of the global
motion compensation predicted image synthesizer according
to this invention. It will be assumed that the motion
vectors of the corners of the image are transmitted as
global motion compensation parameters. The motion vectors
of representative points are calculated by equations (9),
(10) in a computing unit 1205 using information 1204
relating to motion vectors of the corners of the image.
Using information 1206 relating to the motion vectors of
these representative points, the motion vectors of
provisional representative points are calculated for each
line using equation (11) in a computing unit 1207. Then,
by using information 1208 relating to the motion vectors
of these provisional representative points, motion
vectors for each pixel are calculated from equation (12)
in a computing unit 1209. At the same time, using
information 1210 relating to the motion vectors of each
pixel and the decoded image 1202 of the immediately
preceding frame, a global motion compensation predicted
image 1203 is synthesized and output by a processing unit
1211.

In addition to a conventional image encoder or image
decoder using a special circuit or chip, this invention
may also be applied to a software image encoder or
software image decoder using a universal processor.

Fig. 4 and Fig. 5 respectively show examples of a
software image encoder 400 and software image decoder
500. In the software encoder 400, an input image 401 is
stored in an input frame memory 402, and a universal
processor 403 reads and encodes information from the
input frame memory 402. The program required for driving
the universal processor 403 is read from a storage device
408 comprising a hard disk or floppy disk, etc., and
stored in a program memory 404. The universal processor
encodes the information by using a processing memory 405.
The encoded information output by the universal processor
403 is then stored in an output buffer 406, and output as
an encoded bit stream 407.

Fig. 13 shows a flowchart of the encoding software which
runs on the software encoder shown in Fig. 4.

First, image encoding is started in a step 1301, and 0 is
input to a variable N in a step 1302. Next, if the value
of N is 100, 0 is input to N in steps 1303, 1304. N is a
frame number counter which is incremented by 1 whenever
processing of one frame is completed, and it can take a
value in the range 0-99 when encoding is performed. When
the value of N is 0, the frame being encoded is an I
frame (motion compensation is not performed, and
intraframe coding is performed for all blocks), otherwise
it is a P frame (a frame comprising blocks where motion
compensation is performed). This means that if the upper
limit of N is 100, one I frame is encoded after every 99
P frames. The optimum value of this upper limit varies
according to the performance of the encoder and the
environment in which the encoder is used. In this
example, the value 100 was used, but the upper limit is
not necessarily limited to 100. The determination and
output of frame type (I or P) are performed in a step
1305. When the value of N is 0, 'I' is output as frame
type identifying information to the output buffer, and
the frame encoded thereafter is an I frame. Herein, the
expression "output to the output buffer" means that the
information is output from the encoder to external
devices as part of the bit stream after being stored in
the output buffer (406 in Fig. 4). When N is not 0, 'P'
is output to the output buffer as frame type identifying
information, and the frame encoded thereafter is a P
frame.
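
The behaviour of the counter N in steps 1302 to 1305 and 1314 can be sketched as follows. The function name and the parameter name period are assumptions for illustration; period corresponds to the value 100 of this example.

```python
def frame_types(num_frames, period=100):
    """Return the frame type ('I' or 'P') of each encoded frame."""
    types = []
    n = 0                       # step 1302: 0 is input to N
    for _ in range(num_frames):
        if n == period:         # steps 1303, 1304: wrap the counter
            n = 0
        types.append("I" if n == 0 else "P")   # step 1305
        n += 1                  # step 1314: 1 is added to N
    return types


types = frame_types(201)
print([index for index, t in enumerate(types) if t == "I"])
```

With a period of 100, frames 0, 100 and 200 are I frames and each of them is followed by 99 P frames.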

In a step 1306, the input image is stored in a frame
memory A. The frame memory A described here denotes part
of the memory area of the software encoder (for example,
this memory area is reserved in a memory 405 of Fig. 4).
In a step 1307, it is determined whether the frame
currently being encoded is an I frame. If it is not an I
frame, motion estimation/motion compensation is performed
in a step 1308.

Fig. 14 shows the detailed processing which is performed
in this step 1308. First, global motion estimation is
performed between the images stored in the frame memories
A and B (the decoded image of the immediately preceding
frame is stored in the frame memory B) in a step 1401,
and the motion vectors of the corners of the image are
output to the output buffer as global motion parameters.
In a step 1402, the motion vectors of representative
points are calculated by equations (9), (10) using the
motion vectors of the corners of the image. Next, in a
step 1403, 0 is input to a variable M. M represents the
number of the line being processed in the image. When M
is 0, it means that the uppermost line of the image is
being processed, and when M is a value obtained by
subtracting 1 from the number of lines in the image, it
means that the lowermost line of the image is being
processed. Using the motion vectors of representative
points calculated in the step 1402, the motion vectors of
provisional representative points on the Mth line are
calculated by equation (11) in a step 1404. Then, making
use of the motion vectors of these provisional
representative points, the motion vectors of all the
pixels in the Mth line are calculated by equation (12),
and the Mth line of the global motion compensation
predicted image is synthesized according to the
calculated motion vectors, using the decoded image of the
immediately preceding frame stored in the frame memory B,
in a step 1405. In a step 1406, 1 is added to the value
of M. In a step 1407, if the value of M is equal to the
number of lines in the image, the routine proceeds to a
step 1408, and if it is not equal, the routine returns to
the step 1404. When the processing of the step 1408
starts, the predicted image due to global motion
compensation is stored in a frame memory F. In the steps
from the step 1408 onward, block matching is performed.
First, in the step 1408, motion estimation for every
block is performed between the frame memory F and the
frame memory A (input image), the motion vectors of each
block are calculated, and these motion vectors are output
to the output buffer. Next, a predicted image is
synthesized by block matching in a step 1409 using the
motion vectors and the image stored in the frame memory
F, and this is stored in a frame memory C as a final
predicted image. In a step 1410, a differential image of
the frame memories A and C is found, and this is stored
in the frame memory A.
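
The line-by-line loop of steps 1403 to 1407 can be sketched as follows for one motion vector component. The sketch uses exact fractions instead of the integer equations (11) and (12), and it assumes, purely for illustration, that the four representative points lie at (0, 0), (width, 0), (0, height) and (width, height); the function name and the two counters are likewise assumptions.

```python
from fractions import Fraction


def motion_field(top_left, top_right, bottom_left, bottom_right,
                 width, height):
    """Return (field, vertical_steps, horizontal_steps)."""
    field = []
    vertical_steps = horizontal_steps = 0
    for line in range(height):              # variable M of the flowchart
        ty = Fraction(line, height)
        # Provisional representative points of this line (step 1404).
        left = top_left + (bottom_left - top_left) * ty
        right = top_right + (bottom_right - top_right) * ty
        vertical_steps += 2
        row = []
        for column in range(width):         # all pixels of the line (step 1405)
            tx = Fraction(column, width)
            row.append(left + (right - left) * tx)
            horizontal_steps += 1
        field.append(row)
    return field, vertical_steps, horizontal_steps


field, vertical_steps, horizontal_steps = motion_field(0, 8, 4, 16, 8, 4)
print(vertical_steps, horizontal_steps)
```

For an image of 4 lines and 8 pixels per line, the vertical step is evaluated 8 times and the horizontal step 32 times; for realistic image sizes the share of the vertical step becomes much smaller still.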

Returning now to Fig. 13, immediately before the process
in the step 1309 is started, when the current frame is an
I frame, the input image is stored in the frame memory A,
and when the current frame is a P frame, a differential
image between the input image and predicted image is
stored in the frame memory A. In the step 1309, DCT is
applied to the image stored in this frame memory A, and
the DCT coefficients calculated here are output to the
output buffer after being quantized. Further, in a step
1310, inverse quantization and inverse DCT are applied to
these quantized DCT coefficients, and the image obtained
as a result is stored in the frame memory B. Next, it is
again determined whether the current frame is an I frame,
and when the image is not an I frame, the images in the
frame memories B and C are added in a step 1312, and this
result is stored in the frame memory B. Here, the
encoding of one frame is finished, and the image stored
in the frame memory B immediately before processing of a
step 1313 is performed is a reconstructed image of the
frame for which encoding has just been completed (same as
that obtained on the decoding side). In the step 1313, it
is determined whether the frame for which coding is
complete is the last frame, and if it is the last frame,
coding is terminated. When it is not the last frame, 1 is
added to N in a step 1314, the routine returns to the
step 1303 again, and encoding of the next frame is
started. It will be understood that although the
flowchart described here relates to a method of applying
block matching to the global motion compensation
predicted image synthesized as a result of performing
global motion compensation (method corresponding to a
device using a motion compensation processor 801 of Fig.
8), a flowchart relating to a method of performing global
motion compensation and block matching in parallel
(method corresponding to a device using a motion
compensation processor 901 of Fig. 9) can be prepared by
making a slight modification.

On the other hand, in the software decoder 500, an input
encoded bit stream 501 is first stored in an input buffer
502, and read by a universal processor 503. The universal
processor 503 decodes the information using a program
memory 504 for storing a program read from a storage
device 508 comprising a hard disk or floppy disk, etc.,
and a processing memory 505. The decoded image obtained
is then stored in an output frame memory 506, and output
as an output image 507.

Fig. 15 shows a flowchart of decoding software which runs
on the software decoding device shown in Fig. 5.
Processing is started in a step 1501, and in a step 1502,
it is determined whether or not there is input
information. Here, if there is no input information,
decoding is terminated in a step 1503. When there is
input information, frame type information is first input
in a step 1504. The term "input" means that information
stored in the input buffer 502 is read. In a step 1505,
it is determined whether or not the read frame type
information is 'I'. When it is not 'I', predicted image
synthesis is performed in a step 1506. The details of the
processing performed in this step 1506 are shown in Fig.
16.

First, in a step 1601, the motion vectors of the corners
of the image are input. In a step 1602, the motion
vectors of representative points are calculated by
equations (9), (10) using the motion vectors of the
corners of the image. Next, in a step 1603, 0 is input to
the variable M. M represents the number of the line being
processed in the image. When M is zero, it means that the
uppermost line of the image is being processed, and when
M is a value obtained by subtracting 1 from the number of
lines of the image, it means that the lowermost line of
the image is being processed. Using the motion vectors of
representative points calculated in the step 1602, the
motion vectors of provisional representative points on
the Mth line are calculated by equation (11) in a step
1604. The motion vectors for all the pixels in the Mth
line are calculated by equation (12) in a step 1605. From
the calculated motion vectors, the Mth line of the global
motion compensation predicted image is synthesized using
the decoded image of the immediately preceding frame
stored in a frame memory E, and this is stored in a frame
memory G. The memory G herein means part of the area of
the memory 505 of the software decoder. In a step 1606, 1
is added to the value of M. If the value of M is equal to
the number of lines of the image in a step 1607, the
routine proceeds to a step 1608, and if it is not equal,
the routine returns to the step 1604. When the processing
of the step 1608 is started, the predicted image due to
global motion compensation is stored in the frame memory
G. In the step 1608, block matching is performed. Motion
vector information for each block is input, the predicted
image due to block matching is synthesized using these
motion vectors and the image stored in the frame memory
G, and this predicted image is stored in the frame memory
D.

Returning to Fig. 15, quantized DCT coefficients are
input in a step 1507, and the image obtained by applying
inverse quantization and inverse DCT to these is stored
in the frame memory E. In a step 1508, it is determined
whether or not the frame currently being decoded is an I
frame. When it is not an I frame, the images stored in
the frame memories D and E are added in a step 1509, and
the resulting image is stored in the frame memory E. The
image stored in the frame memory E immediately prior to
performing the processing of the step 1510 is the
reproduced image. In the step 1510, the image stored in
this frame memory E is output to the output frame memory
506, and output from the decoder as an output image
without modification. When decoding of one frame is
completed in this way, processing returns again to the
step 1502.

When the software image encoder and software 
image decoder shown in Fig. 4 and Fig. 5 are made to 
execute a program implementing the method of 
synthesizing interframe predicted images described in 
this specification, global motion compensation or 
warping prediction can be performed with a smaller 
amount of computation. Compared to the case when this 
invention is not used, therefore, power consumption is 
reduced, devices are less costly to manufacture, images 
with more pixels can be processed in real time, and
simultaneous parallel processing can be performed
including processing other than encoding and decoding. 
Moreover, by using the algorithm shown in this 
specification, compressed image data which could not be 
reproduced in real time due to limitation of the 
computing ability of conventional encoders and decoders, 
can now be reproduced in real time. 

The embodiments of this invention described 
above further comprise the following embodiments. 

(1) In conventional image encoding, error coding is 
performed using discrete cosine transformation or the 
like after interframe prediction; however, this 
invention may also be used for image encoding or decoding 
in which the interframe predicted image is used as the 
reconstructed image without modification. 

(2) In the above description, it was assumed that the 
shape of the image was rectangular; however, the 
invention may be applied equally well to images having 
any arbitrary shape other than rectangular. In this 
case, the processing of the invention may first be 
applied to a rectangle enclosing the image of arbitrary 
shape, and motion vectors may then be calculated only 
for the pixels inside the image of arbitrary shape. 

(3) In the above specification, a motion vector 
interpolation/extrapolation algorithm was described 
using two-step processing wherein the value of p or q 
was a non-negative integer power of 2. However, this 
two-step processing algorithm also has the effect of 
reducing the numerator of a division even when p and q 
are not non-negative integer powers of 2, and it is 
therefore effective for preventing overflow of 
registers. 
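
The approach of embodiment (2) above can be sketched as follows. This is an illustrative sketch only: the function name motion_vectors_in_shape and the binary mask representation are assumptions, and the affine model of equation (1) is used as a simple stand-in for the interpolation/extrapolation procedure of this specification.

```python
def motion_vectors_in_shape(mask, a):
    """Compute motion vectors only for pixels inside an arbitrary shape.

    mask: list of rows; a non-zero entry marks a pixel inside the shape
    a:    coefficients (a0, a1, a2, a3, a4, a5) of equation (1)
    Returns a dict mapping (x, y) to the motion vector (u, v).
    """
    inside = [(x, y) for y, row in enumerate(mask)
              for x, flag in enumerate(row) if flag]
    if not inside:
        return {}
    # Rectangle enclosing the shape (the region the processing scans).
    x_min = min(x for x, _ in inside)
    x_max = max(x for x, _ in inside)
    y_min = min(y for _, y in inside)
    y_max = max(y for _, y in inside)
    vectors = {}
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            if not mask[y][x]:
                continue  # pixel is in the rectangle but outside the shape
            u = a[0] * x + a[1] * y + a[2]
            v = a[3] * x + a[4] * y + a[5]
            vectors[(x, y)] = (u, v)
    return vectors
```

Pixels that lie inside the enclosing rectangle but outside the shape are skipped, so the amount of computation scales with the area of the shape rather than with the area of the rectangle.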

Field of the Invention 

Fig. 17 shows specific examples of an encoding/ 
decoding device using the prediction image synthesis 
method shown in this specification. 

(a) shows a case where image encoding/decoding 
software installed in a personal computer 1701 is used 
as the image encoding/decoding device. This software 
is recorded on some type of storage medium (CD-ROM, 
floppy disk, hard disk, etc.), and read by the 
personal computer. Further, by connecting this 
personal computer to a communication line, the device 
can be used as an image communication terminal. 

(b) shows a case where a coded bit stream 
comprising moving image information encoded by the 
method of this invention and recorded on a storage 
medium 1702 is read and reconstructed by a reproducing 
device 1703 comprising a device according to this 
invention, and the reconstructed video signal is 
displayed on a television monitor 1704. The 
reproducing device 1703 may also simply read the coded 
bit stream, with the decoding device being built into the 
television monitor 1704. 

(c) shows a case wherein the decoding device of 
this invention is built into a television receiver 1705 
for digital broadcasting. 

(d) shows a case wherein the decoder is built 
into a set-top box 1709 connected to a cable TV cable 
1708 or a satellite/terrestrial wave broadcasting 
antenna, and the image is reproduced on a television 
monitor 1710. 

Instead of the set-top box, the decoder may also be 
built into the television monitor as in the case of 
1704 of (b). 

(e) shows a case where the encoder/decoder of 
this invention is built into a digital portable 
terminal 1706. The digital portable terminal may be a 
transmitting/receiving terminal comprising an 
encoder/decoder, a transmitting terminal only with an 
encoder, or a receiving terminal only with a decoder. 

(f) shows a case where the encoder is built 
into a camera 1707 for photographing moving images. 
The camera 1707 may instead simply acquire a video signal, 
which is then input to a dedicated encoder 1711. In any 
of the devices or systems shown in the figure, the 
method described in this specification permits 
simplification of the device as compared with the case 
when prior art technology is used. 



